Coronaviruses cause respiratory tract infections in humans and outbreaks of deadly pneumonia worldwide. Infections are initiated
by the transmembrane spike (S) glycoprotein, which binds to host receptors and fuses the viral and cellular membranes.
To understand the molecular basis of coronavirus attachment to oligosaccharide receptors, we determined cryo-EM structures
of coronavirus OC43 S glycoprotein trimer in isolation and in complex with a 9-O-acetylated sialic acid. We show that the ligand
binds with fast kinetics to a surface-exposed groove and that interactions at the identified site are essential for S-mediated viral
entry into host cells, but free monosaccharide does not trigger fusogenic conformational changes. The receptor-interacting site
is conserved in all coronavirus S glycoproteins that engage 9-O-acetyl-sialoglycans, with an architecture similar to those of the
ligand-binding pockets of coronavirus hemagglutinin esterases and influenza virus C/D hemagglutinin-esterase fusion glycoproteins.
Our results demonstrate that these viruses evolved similar strategies to engage sialoglycans at the surface of target cells.


Coronaviruses are large, positive-sense enveloped RNA viruses
in the Nidovirales order and are divided into four genera: α,
β, γ and δ. Two β-coronaviruses have caused outbreaks of
deadly pneumonia in humans since the beginning of the 21st century.
The severe acute respiratory syndrome coronavirus (SARS-CoV)
emerged in 2002 and was responsible for an epidemic that
spread to five continents with a fatality rate of 10% before being
contained in 2003 (with additional cases reported in 2004). The Middle
East respiratory syndrome coronavirus (MERS-CoV) emerged in
the Arabian Peninsula in 2012 and has caused recurrent outbreaks
in humans with a fatality rate of 35%. SARS-CoV and MERS-CoV
are zoonotic viruses that crossed the species barrier via bats/palm
civets and dromedary camels, respectively. Four other coronaviruses
of zoonotic origin are endemic in the human population,
accounting for up to 30% of mild respiratory tract infections and
causing severe complications or fatalities in young children, the
elderly and immunocompromised individuals. These viruses are
HCoV-NL63 and HCoV-229E (α-coronaviruses) and HCoV-OC43
and HCoV-HKU1 (β-coronaviruses). Currently, no specific antiviral
treatments or vaccines are available to combat any human coronavirus.
Furthermore, future cross-species transmission events of
coronaviruses seem likely, given the large reservoir found in bats.
Studying coronaviruses will therefore help in understanding the
principles governing cross-species transmission and adaptation to
humans and in preparing for putative future zoonotic outbreaks.

Coronaviruses use homotrimers of the spike (S) glycoprotein
to promote host attachment and fusion of the viral and cellular
membranes for entry. S is the main antigen present at the viral
surface and is the target of neutralizing antibodies during infection.
As a result, it is a focus of vaccine design. S is a class I viral
fusion protein synthesized as a single polypeptide chain precursor
of approximately 1,300 amino acids. For many coronaviruses, S is
processed by host proteases to generate two subunits, designated
S1 and S2, which remain non-covalently bound in the pre-fusion
conformation. The N-terminal S1 subunit comprises four β-rich
domains, designated A, B, C and D, with domain A or B acting as
receptor-binding domains in different coronaviruses. The transmembrane
C-terminal S2 subunit is the metastable spring-loaded
fusion machinery. During entry, S2 is further proteolytically cleaved
at the S2′ site, immediately upstream of the fusion peptide. This
second cleavage step occurs for all coronaviruses and is believed
to activate the protein for membrane fusion, which takes place via
irreversible conformational changes. In recent years, cryo-EM
work led to the determination of coronavirus S glycoprotein ectodomain
structures in the pre-fusion and post-fusion states, providing
snapshots of the start and end points of the fusion reaction.
Cryo-EM structures of the SARS-CoV and MERS-CoV S glycoproteins
in complex with human neutralizing antibodies also shed light
on the mechanism of fusion activation.

HCoV-OC43 was isolated for the first time in 1967 from volunteers
at the Common Cold Unit in Salisbury, United Kingdom.
Molecular clock analysis of genome sequences suggested that
HCoV-OC43 originated from a zoonotic transmission event of a
bovine coronavirus (BCoV) and dated their most recent common
ancestor between the 1890s and the 1950s. HCoV-OC43, HCoV-HKU1,
BCoV and porcine hemagglutinating encephalomyelitis
virus (PHEV) use 9-O-acetyl-sialic acid (9-O-Ac-Sia) as a receptor,
which is terminally linked to oligosaccharides decorating glycoproteins
and gangliosides at the host cell surface. The S glycoprotein
of these viruses mediates 9-O-Ac-Sia binding, whereas the
hemagglutinin-esterase (HE) protein acts as receptor-destroying
enzyme, via sialate-O-acetyl-esterase activity, to facilitate release
of viral progeny from infected cells and escape from attachment to
non-permissive host cells or decoys. These properties are shared
with the hemagglutinin-esterase-fusion (HEF) glycoproteins of
influenza C and D viruses.

Fig. 1 | Cryo-EM structure of the apo-HCoV-OC43 S glycoprotein. a, Ribbon diagrams of the HCoV-OC43 S ectodomain trimer in two orthogonal
orientations. The individual protomers are each in a different color, and the glycans are rendered as dark blue spheres. b, Ribbon diagrams of the
superimposed HCoV-OC43 (light pink) and HCoV-HKU1 (dark gray) B domains in two orthogonal orientations. The N and C termini are labeled.

Sialic acids are ubiquitous terminal residues of glycoconjugates
and occur in a wide variety as a result of modifications of the core
N-acetyl neuraminic acid molecule and of differences in glycosidic
linkages. Previous biochemical work established that domain A
of coronavirus S glycoproteins mediates attachment to oligosaccharide
receptors, such as for HCoV-OC43 and BCoV, which interact
with 9-O-Ac-Sia, or MERS-CoV, which binds to α2,3-linked
(and to a lesser extent to α2,6-linked) sialic acids, with sulfated
sialyl-Lewis X being the preferred binder. On the basis of the
galectin-like fold of domain A of coronavirus S and mutational
analyses, it was suggested that key saccharide-binding residues
are located on the viral membrane-distal side of the BCoV β-sandwich.
Our recent work, however, indicated that the 9-O-Ac-Sia binding
site of HCoV-OC43, HCoV-HKU1, BCoV and PHEV is conserved
among these viruses and resides at a distinct location of domain A.
Although we validated the findings using mutagenesis and BCoV
infectivity assays, no structural information is available on the
mechanism of coronavirus interaction with saccharides aside from
in silico modeling. This knowledge gap limits our understanding
of the roles of these receptors in viral infection or zoonosis and hinders
the rational design of inhibitors.

To understand attachment of coronaviruses to sialic acids at
the surface of host cells, we determined cryo-EM structures of a
stabilized HCoV-OC43 S glycoprotein trimer in isolation and in
complex with 5-N-acetyl,9-O-acetyl-neuraminic acid α-methyl glycoside
(9-O-Ac-Me-Sia) at 2.9-Å and 2.8-Å resolution, respectively.
We show that the ligand binds with fast association/dissociation
kinetics in a groove on HCoV-OC43 S located at the surface
of domain A. Site-directed mutagenesis combined with binding
experiments validated our structural findings, and infectivity
assays showed that the residues involved in 9-O-Ac-Sia binding are
essential for HCoV-OC43 S-mediated entry into host cells. Our
results further show that binding to free 9-O-Ac-Me-Sia and/or
acidic pH did not induce fusogenic conformational changes of S,
suggesting that multivalent interactions with sialoglycans and/or
further attachment to a putative proteinaceous receptor are essential
to promote membrane fusion. The receptor-interacting site is
conserved in all coronavirus S glycoproteins known to attach to
9-O-Ac-sialoglycans and shares architectural similarity with the
ligand-binding pockets of coronavirus HEs and influenza virus
C/D HEF glycoproteins, thus highlighting common structural
principles of recognition.


Results 

Structure of the apo-HCoV-OC43 S glycoprotein. We determined
a 2.9-Å resolution cryo-EM reconstruction of an apo-HCoV-OC43
S ectodomain trimer mutant, in which the S1/S2 furin cleavage site
was abrogated to prevent proteolytic processing during biogenesis.
HCoV-OC43 S folds as a 150-Å high and 130-Å wide compact
trimer (Fig. 1a, Supplementary Fig. 1a,b and Table 1). The S1
subunit has a V-shaped architecture resulting from the 3D arrangement
of its four domains (A, B, C and D), similarly to other
β-coronavirus S structures. The S2 subunit, which is more conserved
than the S1 subunit among coronaviruses, folds as a mostly
helical, elongated trimeric unit with a connector domain appended
at its C-terminal end (Fig. 1a). Among available coronavirus S
glycoprotein structures, HCoV-OC43 S is most similar to mouse
hepatitis virus (MHV) S (r.m.s. deviation (r.m.s.d.) 4.7 Å over
979 aligned Cα positions) and to HCoV-HKU1 S (r.m.s.d. 4.5 Å
over 949 aligned Cα positions), sharing 62% and 68% sequence
identity, respectively. The cryo-EM reconstruction resolves 14
N-linked glycans extending from the surface of each protomer.
The HCoV-OC43 S oligosaccharide density is comparable to that
of SARS-CoV S and MERS-CoV S, with all three viruses
belonging to the β-genus, but lower than the glycan density of
the porcine delta coronavirus S (δ-genus) or the HCoV-NL63 S
(α-genus) glycoproteins.

Domain B shows the highest variability within S1 subunits across
coronaviruses, which correlates with the ability of different viruses
to interact with distinct host receptors. For β-coronaviruses, the
canonical architecture of domain B comprises a conserved five-stranded
anti-parallel β-sheet, decorated with α-helices on both
sides, and a highly variable external subdomain that can mediate
receptor engagement for SARS-CoV or MERS-CoV. Domain B
of HCoV-OC43 and that of HCoV-HKU1 are structurally similar and can
be superimposed with an r.m.s.d. of 1.0 Å over 251 aligned Cα positions,
with differences largely restricted to the external subdomain
(Fig. 1b). The current consensus in the field is that HCoV-OC43 S
does not rely on receptors other than 9-O-Ac-sialoglycans for promoting
viral entry into host cells. In contrast with the MERS-CoV S
and SARS-CoV S glycoproteins, in which domain B adopts
alternative conformations, we detected a single closed conformation
of domain B in the HCoV-OC43 S structure (Fig. 1a). Only the
closed domain B conformation was also observed for the MHV S,
HCoV-NL63 S, HCoV-HKU1 S, PDCoV S and IBV S glycoprotein
structures.

Table 1 | Cryo-EM data collection, refinement and validation statistics

                                  Apo-HCoV-OC43 S          Holo-HCoV-OC43 S
                                  (EMD-20070, PDB 6OHW)    (EMD-0557, PDB 6NZK)
Data collection and processing
  Magnification                   [illegible]              [illegible]
  Voltage (kV)                    300                      300
  Electron exposure (e−/Å²)       70                       70
  Defocus range (µm)              0.3-4.8                  [illegible]
  Pixel size (Å)                  0.525                    1.05
  Symmetry imposed                C3                       C3
  Initial particle images (no.)   197,791                  332,912
  Final particle images (no.)     69,648                   105,919
  Map resolution (Å)              2.9                      2.8
  FSC threshold                   [illegible]              [illegible]
Refinement
  Initial model used (PDB code)   3JCL                     [illegible]
  Model resolution (Å)            3.0                      2.9
  FSC threshold                   [illegible]              [illegible]
  Map sharpening B factor (Å²)    −61                      [illegible]
Model composition
  Nonhydrogen atoms               27,477                   [illegible]
  Protein residues                3,519                    3,519
  Ligands                         [illegible]              [illegible]
  Waters                          186                      396
B factors (Å²)
  Protein                         18.6                     12.3
  Ligand                          [illegible]              [illegible]
R.m.s. deviations
  Bond lengths (Å)                0.026                    0.025
  Bond angles (°)                 1.80                     1.82
Validation
  MolProbity score                [illegible]              0.8
  Clashscore                      0.6                      1
  Poor rotamers (%)               0.4                      0.4
Ramachandran plot
  Favored (%)                     [illegible]              [illegible]
  Allowed (%)                     100                      99.9
  Disallowed (%)                  0                        0.1

Cryo-EM identification of a sialoside-binding site in the
HCoV-OC43 S glycoprotein. HCoV-OC43, HCoV-HKU1, BCoV
and PHEV attach to the surface of target cells by binding to
9-O-Ac-sialoglycans. To directly visualize the binding site and characterize
the molecular details of the interactions, we incubated the
HCoV-OC43 S protein with 100 mM 9-O-Ac-Me-Sia, prior to
vitrification and cryo-EM data collection. We determined a 3D
reconstruction of the stabilized HCoV-OC43 S protein in complex
with its receptor at 2.8-Å resolution, hereafter referred to as
holo-HCoV-OC43 S (Supplementary Fig. 1c,d and Table 1). The resolution
estimate is supported by the visible ordered water molecules
interacting with the S glycoprotein, as expected at this resolution.
The structure reveals that the ligand interacts with a groove at the
periphery of domain A, in agreement with the biochemical observations
reported by Hulswit et al. (Fig. 2a-c). The receptor therefore
docks into a distinct groove from those used by either human
galectin-3 or the rhesus rotavirus sialic acid-attachment
protein (VP8*) to recognize their respective ligands
(Supplementary Fig. 2a-c).

The sialoside-interacting groove defines two hydrophobic pockets,
designated P1 and P2 (according to the nomenclature defined
by Hulswit et al.), separated by the Trp90 indole side chain, and is
delineated by two loops forming the rims of the binding site, termed
L1 (27-Asn-Asp-Lys-Asp-Thr-Gly-32) and L2
(80-Leu-Lys-Gly-Ser-Val-Leu-Leu-86) (Fig. 2c). The 9-O-Ac-Me-Sia C1-carboxylate
forms a salt bridge with the Lys81 side chain amine and a hydrogen
bond with the Ser83 side chain hydroxyl (Fig. 2c and Supplementary
Fig. 3). The 5-nitrogen atom of the ligand makes a hydrogen
bond with the Lys81 backbone carbonyl (Fig. 2c). The ligand
N-acetyl methyl inserts into the P2 hydrophobic pocket, defined
by residues Leu80, Trp90 and Phe95. The ligand 9-O-acetyl
methyl docks in the P1 hydrophobic pocket, which comprises
Leu85, Leu86 and Trp90, whereas the 9-O-acetyl carbonyl
makes a hydrogen bond with the Asn27 side chain amide. These
observations rationalize the specificity of HCoV-OC43 S for
this sialoside, because the 9-O-acetyl group is accommodated
by a combination of hydrogen bonding and shape complementarity
(Fig. 2b,c), similarly to 9-O-Ac-Sia binding sites of coronavirus,
torovirus and orthomyxovirus HEs/HEFs. Although
most interactions occur with the same side of the ligand, the side
chain hydroxyl of residue Thr31, which faces the 9-O-Ac-Me-Sia
solvent-exposed side, forms a hydrogen bond with the Trp90
indole nitrogen. This interaction participates in stapling the A
domain N-terminal segment to the β-sandwich core and contributes
to defining the shape of the ligand-binding groove (Fig. 2c).
Overall, the ligand buries 350 Å² of its surface upon binding to
the HCoV-OC43 S protein, corresponding to approximately 62%
of the 9-O-Ac-Me-Sia total solvent-accessible surface area. The
observed binding mode is compatible with interactions with longer
oligosaccharides, including α2,3- and α2,6-linked sialoglycans
found on cell surfaces.
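
The surface-area figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis: it only rearranges the two numbers given in the text (350 Å² buried, about 62% of the total) to recover the implied total solvent-accessible surface area of the ligand. The function name `total_sasa` is illustrative.

```python
# Values quoted in the text; everything derived from them is an estimate.
BURIED_AREA_A2 = 350.0   # ligand surface buried upon binding, in Å^2
BURIED_FRACTION = 0.62   # approximate fraction of the ligand's total SASA


def total_sasa(buried_area, buried_fraction):
    """Total solvent-accessible surface area implied by a buried area and fraction."""
    if not 0.0 < buried_fraction <= 1.0:
        raise ValueError("buried_fraction must be in (0, 1]")
    return buried_area / buried_fraction


if __name__ == "__main__":
    total = total_sasa(BURIED_AREA_A2, BURIED_FRACTION)
    print(f"implied total ligand SASA: {total:.0f} Å^2")
    print(f"surface left solvent-exposed: {total - BURIED_AREA_A2:.0f} Å^2")
```

Because the 62% figure is itself rounded, the implied total should be read as approximate.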
Fig. 2 | Identification of a sialoglycan-binding site in the holo-HCoV-OC43 S glycoprotein structure. a, Molecular surface representation of the
holo-HCoV-OC43 S ectodomain trimer structure with the bound ligand shown as sticks. Protomers are individually colored. b, Surface representation of the
ligand-binding site colored by electrostatic potential from −12 to +12 k_BT/e_c. c, Two orthogonal views of the 9-O-Ac-Me-Sia binding site. The A domain
is rendered as a ribbon diagram with the side chains of surrounding residues shown as sticks. The cryo-EM density is shown as a blue mesh. In a-c, the
ligand is rendered as sticks with atoms colored by element (carbon, gray; nitrogen, blue; oxygen, red). Dashed lines show a salt bridge and hydrogen bonds
formed between the ligand and domain A.


HCoV-OC43 S binds 9-O-Ac-Sia with fast association and dissociation
rates. To characterize the binding kinetics and affinity of an
individual HCoV-OC43 S binding site for 9-O-Ac-Sia receptors, we
recombinantly produced the monomeric HCoV-OC43 S domain A
and used biolayer interferometry to analyze its attachment to biotinylated
oligosaccharides immobilized on the surface of streptavidin-coated
biosensors. Domain A bound to and dissociated from
6-sialyl-5-N-acetyl,9-O-acetyl-lactosamine (9-O-Ac-6SLN) with fast on
and off rates (Fig. 3a). The observed binding was specific, as it
was critically dependent on the presence of the sialate-9-O-acetyl
moiety, in accordance with previous observations. Domain
A did not detectably bind to the corresponding non-O-acetylated
oligosaccharide, 6SLN. This finding is explained by the absence of
the 9-O-acetyl moiety in 6SLN, which contributes one-third of the
total ligand buried surface area by contacting Asn27 and the P1
pocket of the glycoprotein, as revealed in our structure (Fig. 2b,c).
Moreover, binding was largely abolished by de-O-acetylation of
biosensor-bound 9-O-Ac-6SLN with porcine torovirus HE (Fig. 3a).
Finally, substitution of Trp90 with alanine abrogated interactions
with 9-O-Ac-6SLN (Fig. 3a), thereby confirming the central role in
sialoside attachment of this amino acid residue, which defines the floor
of the ligand-binding groove.


Using steady-state analysis, we determined an equilibrium
dissociation constant K_D = 49.7 ± 10.7 µM for the HCoV-OC43
domain A:9-O-Ac-6SLN complex (Fig. 3b,c). We calculated a half-life
of t_1/2 = 0.7 s from the dissociation curves, a dissociation rate
constant k_off = 1 s⁻¹ (k_off = ln 2/t_1/2) and an association rate constant
k_on = 1.4 × 10⁴ M⁻¹ s⁻¹. These values predict rapid S-mediated virion
attachment, particularly in high-density receptor environments
such as the mucus layer, glycocalyx and cell surfaces. On the basis
of these results, the mean life (1/k_off) of the 1:1 complex is predicted
to be short, on the order of 1 s, much shorter than the mean life of an
individual influenza A hemagglutinin receptor-binding domain
in complex with sialic acid, which ranges between 7 and 13.5 s.
In the context of authentic virions, however, the large number of
S glycoproteins at the surface of coronaviruses is likely to increase
the apparent binding affinity for sialoglycans through avidity, as
described for influenza virus. We posit that HCoV-OC43 and
related β-coronavirus S glycoproteins evolved to dynamically interact
with host sialosides and avoid irreversible attachment to decoy
receptors via HE-mediated virion elution. Dynamic binding in
combination with receptor destruction could promote virion motility
by directional sliding diffusion through high-density interaction
sites, as recently reported for influenza A and C viruses.
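
The relationships between the kinetic quantities in this paragraph can be checked numerically. The sketch below assumes a simple 1:1 binding model with first-order dissociation; the function names are illustrative and the script is not the fitting procedure used for the reported values. Under this assumption, k_off follows from the half-life as ln 2/t_1/2, the mean life is 1/k_off, and K_D = k_off/k_on gives a rough estimate of k_on.

```python
import math

T_HALF_S = 0.7        # half-life of the complex from the text, in seconds
KD_MOLAR = 49.7e-6    # equilibrium dissociation constant from the text, in M


def koff_from_half_life(t_half_s):
    """First-order dissociation rate constant (s^-1) from a half-life (s)."""
    return math.log(2) / t_half_s


def mean_lifetime(koff):
    """Mean life (s) of a complex that decays with rate constant koff."""
    return 1.0 / koff


def kon_from_kd(koff, kd_molar):
    """Association rate constant (M^-1 s^-1) implied by K_D = koff / kon."""
    return koff / kd_molar


def fraction_remaining(koff, t_s):
    """Fraction of complexes still bound after t_s seconds of dissociation."""
    return math.exp(-koff * t_s)


if __name__ == "__main__":
    koff = koff_from_half_life(T_HALF_S)
    print(f"k_off ~ {koff:.2f} s^-1")
    print(f"mean life ~ {mean_lifetime(koff):.2f} s")
    print(f"k_on (1:1 estimate) ~ {kon_from_kd(koff, KD_MOLAR):.2e} M^-1 s^-1")
    print(f"bound after one half-life: {fraction_remaining(koff, T_HALF_S):.2f}")
```

With the quoted inputs, this 1:1 estimate of k_on is of the same order of magnitude as the reported value but not identical to it. The text does not say how k_on was derived, so the script should be read as a consistency check rather than a reproduction.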
Fig. 3 | The identified HCoV-OC43 S interactions with sialosides are characterized by fast kinetics and are required for viral entry. a, Biolayer
interferometry showing binding of wild-type or W90A monomeric HCoV-OC43 domain A to immobilized 6-sialyl-5-N-acetyl,9-O-acetyl-lactosamine
(9-O-Ac-6SLN), 6-sialyl-5-N-acetyl-lactosamine (6SLN) or HE-pre-treated 6-sialyl-5-N-acetyl,9-O-acetyl-lactosamine before binding (9-O-Ac-6SLN,
pre-HE) or after a successful association/dissociation event (9-O-Ac-6SLN, post-HE). b, Binding of different concentrations of wild-type monomeric A
domain to immobilized 9-O-Ac-6SLN. c, Steady-state affinity determination using the curves shown in b. HCoV-OC43 A engages 9-O-Ac-6SLN with a
K_D = 49.7 ± 10.7 µM. d, Asn27, a key 9-O-Ac-Sia-interacting residue visualized in the holo-HCoV-OC43 S glycoprotein structure, was substituted with alanine,
and binding was assessed using a solid-phase lectin binding assay. Data points are averages from three independent technical triplicates. The data are
normalized relative to the wild type. e, Sialoside binding to the identified site is necessary for HCoV-OC43 S-mediated entry of pseudotyped VSV-ΔG
particles into host cells. n=3 pseudovirus experiments (technical replicates). Data are normalized relative to wild type and shown as mean and s.d. of
technical triplicates. f, Western-blot analysis of VSV-ΔG pseudotyped with wild-type or mutant HCoV-OC43 S. VSV-N was used as a quantitative control
for the amount of virions analyzed.
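
A steady-state affinity determination of the kind shown in Fig. 3c is typically a fit of a 1:1 binding isotherm, R = Rmax · C/(K_D + C), to plateau responses measured at several analyte concentrations. The sketch below is an assumption about the general technique, not the authors' software or data: the concentrations and responses are synthetic, and the fit is a plain log-spaced grid search over K_D with a closed-form least-squares Rmax at each grid point.

```python
def langmuir(conc, kd, rmax):
    """Equilibrium response of a 1:1 binding isotherm."""
    return rmax * conc / (kd + conc)


def best_rmax(concs, responses, kd):
    """Closed-form least-squares Rmax for a fixed K_D."""
    basis = [c / (kd + c) for c in concs]
    numerator = sum(b * r for b, r in zip(basis, responses))
    denominator = sum(b * b for b in basis)
    return numerator / denominator


def fit_kd(concs, responses, kd_min=1e-7, kd_max=1e-2, steps=2000):
    """Grid-search K_D (log-spaced between kd_min and kd_max); returns (kd, rmax)."""
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        rmax = best_rmax(concs, responses, kd)
        sse = sum((r - langmuir(c, kd, rmax)) ** 2
                  for c, r in zip(concs, responses))
        if best is None or sse < best[0]:
            best = (sse, kd, rmax)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic, noise-free example data (hypothetical, not from the paper).
    concs = [3.125e-6, 6.25e-6, 12.5e-6, 25e-6, 50e-6, 100e-6, 200e-6]
    responses = [langmuir(c, 50e-6, 1.0) for c in concs]
    kd, rmax = fit_kd(concs, responses)
    print(f"fitted K_D ~ {kd * 1e6:.1f} uM, Rmax ~ {rmax:.2f}")
```

A grid search keeps the example dependency-free. In practice a nonlinear least-squares routine with error estimates would be the more usual choice, and would also yield an uncertainty such as the ± 10.7 µM quoted above.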


HCoV-OC43 S attachment to 9-O-Ac-sialoglycans is necessary
for viral entry. Our structure rationalizes the results of our previous
study in which the effect of individual HCoV-OC43 S domain A
substitutions was assessed using a solid-phase lectin binding assay.
Substitution of Lys81 or Ser83 with alanine completely abrogated
binding, as expected on the basis of our holo-HCoV-OC43 S structure,
owing to disruption of the aforementioned electrostatic interactions
with 9-O-Ac-Sia. Moreover, mutations of Leu80, Leu86 or
Trp90 also disrupted binding, probably as a result of alteration of
the P1 and/or P2 hydrophobic pockets accommodating the ligand
9-O-acetyl and 5-N-acetyl methyl groups, respectively. On the basis
of our structure, we predicted that substitution of Asn27 with alanine
would also inhibit binding, owing to loss of a hydrogen bond
between the ligand 9-O-acetyl carbonyl and the Asn27 side chain
amide. Using the same solid-phase lectin-interaction assays, we
show that this substitution resulted in a loss of detectable binding,
further validating our cryo-EM results (Fig. 3d).

We subsequently evaluated the importance of the identified
interactions for HCoV-OC43 S-mediated infectivity using
pseudotyped G-deficient vesicular stomatitis virus (VSV-ΔG).
Substitutions at Asn27, Thr31, Leu80, Lys81, Ser83, Leu86 and
Trp90 led to complete abrogation of viral entry (Fig. 3e,f), in agreement
with our structural data, biolayer interferometry and solid-phase
lectin binding assays, as well as the literature. These findings
(i) support the importance of the identified residues for interacting
with 9-O-Ac-Sia in the context of a full-length, membrane-embedded
HCoV-OC43 S glycoprotein and (ii) indicate that attachment
to oligosaccharide receptors using the binding site visualized via
cryo-EM plays a critical role in promoting HCoV-OC43 S-mediated
viral entry.
Free 9-O-Ac-Sia does not trigger fusogenic conformational
changes. Comparison of the stabilized apo- and holo-HCoV-OC43
S glycoprotein structures did not reveal conformational rearrangements
upon binding to 9-O-Ac-Sia (the two structures can be superimposed
with a Cα r.m.s.d. of 0.2 Å). To validate this finding, we
investigated the effect of ligand binding to wild-type HCoV-OC43
S (that is, with a native S1/S2 cleavage site sequence) in various
biochemical conditions. Importantly, the HCoV-OC43 S ectodomain
trimer remained uncleaved after secretion (Supplementary Fig. 4a),
perhaps owing to the paucity of furin present in the secretory pathway
of HEK293F cells. Incubation of the wild-type HCoV-OC43
S ectodomain trimer with trypsin at concentrations ranging from
0.2 to 28 µg ml⁻¹ (w/v), to recapitulate proteolytic priming, led
to cleavage at the S1-S2 boundary, as observed via SDS-PAGE
(Supplementary Fig. 4a). Incubation with 28 µg ml⁻¹ trypsin also
led to cleavage of a small fraction of S at a second site, yielding a
band with an apparent molecular weight of ~55 kDa, which could
be consistent with cleavage at the S2′ site (Supplementary Fig. 4a),
an event believed to be restricted to fusion triggering upon receptor
engagement for SARS-CoV S or MERS-CoV S. EM
analysis of negatively stained samples, however, showed that the
HCoV-OC43 S trimers remained in the pre-fusion conformation
and were highly stable, even at the highest trypsin concentration
tested (Supplementary Fig. 4b). Furthermore, we did not detect
conformational changes (i) of pre-cleaved wild-type HCoV-OC43
S incubated with 100 mM 9-O-Ac-Me-Sia, (ii) after trypsin
cleavage of 9-O-Ac-Me-Sia-bound wild-type HCoV-OC43 S or
(iii) of pre-cleaved wild-type HCoV-OC43 S incubated at pH 4.5
(Supplementary Fig. 4c-f). Therefore, 9-O-Ac-Me-Sia binding and
pH acidification of the medium, such as the one occurring in the
endosomal compartment, did not trigger HCoV-OC43 S fusogenic
conformational changes.

Fig. 4 | Conservation of the receptor-binding groove among all 9-O-Ac-sialoglycan-recognizing coronaviruses. a-d, Zoomed-in view of the binding sites
rendered as ribbon diagrams with surrounding residues shown as sticks for HCoV-OC43 (a), BCoV (b), PHEV (c), HCoV-HKU1 (d). Residues are colored
by conservation, based on the analysis of all the S glycoprotein sequences available for each virus. In a, the 9-O-Ac-Me-Sia ligand is rendered as sticks
with atoms colored by element (carbon, gray; nitrogen, blue; oxygen, red). HCoV-OC43, 192 sequences; BCoV, 150 sequences; PHEV, 12 sequences;
HCoV-HKU1, 28 sequences.

To evaluate the ability of our purified glycoprotein construct to
undergo fusogenic conformational changes, we incubated the pre-cleaved
wild-type HCoV-OC43 S ectodomain at 50 °C for 25 min
in the absence or presence of isopropanol (used to dissolve the trypsin
inhibitor added to stop the proteolytic reaction) (Supplementary
Fig. 4g,h). In the latter conditions, we noticed the formation of
HCoV-OC43 S rosettes arising from the nonspecific interactions of
multiple post-fusion trimers via the hydrophobic fusion peptides
(Supplementary Fig. 4h). These biochemical conditions lowered
the energy barrier between the metastable pre-fusion state and the
post-fusion (ground) state, acting as a surrogate for receptor-mediated
fusion activation. This finding indicated that the wild-type
HCoV-OC43 S ectodomain trimer could refold to the post-fusion
conformation, although neither free 9-O-Ac-Me-Sia nor pH acidification
triggered this transition. It has been previously established
that caveolin-mediated endocytosis is a major route of HCoV-OC43
entry into host cells. Because we demonstrated that interactions of
sialoglycans with the identified site are necessary for S-mediated
viral entry, we hypothesize that membrane fusion occurs upon formation
of multivalent interactions with sialoglycans (via mechanical
destabilization of the pre-fusion trimers) and/or binding to a putative
proteinaceous receptor, before or after virus internalization. In
conclusion, 9-O-Ac-Sia-containing receptors appear to differ from
the proteinaceous SARS-CoV receptor, because addition of monomeric
angiotensin-converting enzyme 2 ectodomain to wild-type
SARS-CoV S trimers, in the presence of trypsin, promoted refolding
to the post-fusion state.


A conserved sialoside attachment strategy. HCoV-OC43, BCoV,
PHEV and HCoV-HKU1 are the four coronaviruses known to
engage 9-O-Ac-Sia-capped sialoglycans to initiate infection of target
cells. The A domains of their S glycoproteins share strikingly similar
structures that can be superimposed with a Cα r.m.s.d. between 0.8
and 2.0 Å (Supplementary Fig. 5a-d).
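
For readers unfamiliar with the metric, the Cα r.m.s.d. values quoted here are root-mean-square distances between matched Cα atoms after the two structures have been superimposed. The sketch below is illustrative only. It assumes the coordinates are already superimposed and paired one-to-one (the optimal superposition step, for example the Kabsch algorithm, is not implemented), and the coordinates in the usage example are invented.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired, pre-superimposed
    lists of (x, y, z) coordinates, in the units of the input."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


if __name__ == "__main__":
    # Hypothetical Cα positions in Å, for illustration only.
    structure_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    structure_2 = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
    print(f"Cα r.m.s.d. = {rmsd(structure_1, structure_2):.2f} Å")
```

Because the value depends on which positions are aligned, an r.m.s.d. is only meaningful alongside the number of aligned Cα positions, as reported elsewhere in the text.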

Virtually all residues participating in interactions with 9-O-Ac-Me-Sia
or the formation of the binding groove are conserved in
BCoV S and PHEV S, such as Asn27, Leu80, Lys81, Leu85, Leu86,
Trp90, Phe95 and Thr31 (Fig. 4a-c). Ser83 (HCoV-OC43), however, is
substituted with Thr83 (BCoV/PHEV), and both side chains are expected
to form a hydrogen bond with the C1-carboxylate of the ligand
(Fig. 4a-c). These findings and the abrogation of BCoV and PHEV
domain A-mediated hemagglutination of rat erythrocytes upon
substituting Lys81/Thr83 or Trp90 with alanine indicate that these
two viruses interact with 9-O-Ac-Sia in an identical manner to
HCoV-OC43 S. The binding pocket seems to be compatible with
recognition of 9-O-Ac-Sia and of 9-O-acetyl-glycolyl-neuraminic
acid. Although the latter saccharide is not found in humans, it is
present at the termini of oligosaccharides decorating other mammalian
and avian glycoproteins and glycolipids, and could be a
receptor for BCoV and PHEV.

Fig. 5 | Conservation of the receptor-binding site architecture across
coronavirus S, coronavirus HE and influenza virus HEF glycoproteins.
a, HCoV-OC43 S bound to 9-O-Ac-Me-Sia. b, BCoV HE bound to
5-N-acetyl-4,9-di-O-acetyl-neuraminic acid α-methylglycoside (PDB
3CL5). c, Influenza virus C HEF in complex with 9-N-Ac-Sia. In all panels,
the glycoprotein is rendered as a gray surface with the bound ligand shown
as sticks. The hydrogen bond formed with the carbonyl of the 9-O/N-acetyl
group is shown by dashed lines.

Many of the ligand-interacting residues or residues indirectly
involved in formation of the recognition site identified in the
holo-HCoV-OC43 S structure are also strictly conserved in HCoV-HKU1 S
(HCoV-HKU1 numbering, with the HCoV-OC43 equivalent in parentheses),
such as Asn26 (Asn27), Leu79 (Leu80), Lys80 (Lys81), Leu85 (Leu86),
Trp89 (Trp90) and Phe94 (Phe95) (Fig. 4a,d), suggesting that HCoV-HKU1 S
interacts with 9-O-Ac-sialoglycans using the same binding site as that
identified for HCoV-OC43 S. This hypothesis is supported by
site-directed mutagenesis experiments showing that substitution
of HCoV-HKU1 Lys80, Thr82 (Ser83 in HCoV-OC43) or Trp89
with alanine abrogated HCoV-HKU1 domain A-mediated hemagglutination
of rat erythrocytes.
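
The residue correspondences listed above follow a simple pattern that is easy to verify programmatically. The sketch below encodes only the pairs named in the text. The dictionary and helper names are illustrative, and no sequence alignment is performed. It checks that every HCoV-HKU1 residue number is one less than its HCoV-OC43 counterpart and flags the positions where the amino acid differs.

```python
# HCoV-OC43 residue -> equivalent HCoV-HKU1 residue, as listed in the text.
OC43_TO_HKU1 = {
    "Asn27": "Asn26",
    "Leu80": "Leu79",
    "Lys81": "Lys80",
    "Ser83": "Thr82",
    "Leu86": "Leu85",
    "Trp90": "Trp89",
    "Phe95": "Phe94",
}


def split_residue(label):
    """Split a label such as 'Asn27' into its name and number: ('Asn', 27)."""
    return label[:3], int(label[3:])


def numbering_offsets(mapping):
    """Residue-number difference (OC43 minus HKU1) for each mapped pair."""
    return {oc43: split_residue(oc43)[1] - split_residue(hku1)[1]
            for oc43, hku1 in mapping.items()}


def substitutions(mapping):
    """Pairs whose amino acid identity differs between the two proteins."""
    return {oc43: hku1 for oc43, hku1 in mapping.items()
            if split_residue(oc43)[0] != split_residue(hku1)[0]}


if __name__ == "__main__":
    print("offsets:", sorted(set(numbering_offsets(OC43_TO_HKU1).values())))
    print("substituted positions:", substitutions(OC43_TO_HKU1))
```

The constant offset applies only to the residues listed here. Nothing in the text indicates that it holds across the whole of domain A.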

Our results show that all coronaviruses recognizing host cell
9-O-Ac-sialoglycans share a conserved binding pocket and bind
to the ligand via virtually identical interactions. Strikingly, BCoV
HE and influenza HEF similarly interact with 9-O-Ac-Sia, despite
ample differences in the architecture of their ligand-binding pockets.
Specifically, the two methyl groups of the ligand are docked
into two hydrophobic depressions separated by an aromatic amino
acid side chain, and hydrogen bonds are formed with the 5-nitrogen
of the neuraminic acid core and the 9-O-acetyl carbonyl (Fig. 5a-c).
The similarity across the three binding sites is reinforced by the
observation that 9-O-Ac-Sia buries a comparable surface area at the
interface with each of these glycoproteins and that the 9-O-acetyl
moiety makes a major contribution to it in all three cases (~110 Å²).
One notable difference, however, is that the C1 carboxylate anchors
the ligand to HCoV-OC43 S via a salt bridge and a hydrogen bond,
whereas it relies on the formation of one or two hydrogen bonds
with the BCoV HE or influenza HEF lectin domains, respectively.
These results expand on our previous biochemical work to demonstrate
that BCoV HE and influenza HEF use structural principles
similar to those of other 9-O-Ac-sialoglycan-recognizing human
and animal coronaviruses for engagement to host cell receptors.


Discussion 

We structurally identified and characterized with unprecedented 
detail the HCoV-OC43 S sialoglycan-binding site, which is located 
in a groove at the surface of domain A. This site is conserved in 
all other coronaviruses known to attach to 9-O-Ac-Sia, including 
HCoV-HKUI1 S (another endemic human coronavirus), and BCoV 
S (the presumptive zoonotic ancestor of HCoV-OC43). Our results 
provide a molecular framework explaining the specific recognition 
of 9-O-Ac-Sia-decorated oligosaccharides present at the surface of 
host cells targeted by these viruses. The β-sandwich architecture of
domain A is conserved among all coronaviruses, and some of them 
feature a duplication of this domain at the S glycoprotein N-terminal 
region'®°. Other coronaviruses like MERS-CoV (β-coronavirus),
infectious bronchitis virus (IBV, γ-coronavirus), porcine epidemic
diarrhea virus (α-coronavirus) and transmissible gastroenteritis
virus (α-coronavirus) have also been described to bind to sialoglycans
(distinct from 9-O-Ac-sialosides) via their A domains during
host cell infection*”**°°. The ligand-binding pocket identified in the 
holo-HCoV-OC43 S structure is not conserved in the MERS-CoV 
or in the IBV A domains, for which structures are available, suggesting
that host attachment of this subset of viruses involves different
interactions. The conserved topology of domain A among coronavirus
S glycoproteins indicates that it derived from divergent evolution
of an ancestral galectin domain. Viral evolution and adaptation
thus lead to the use of distinct binding residues on the same domain 
putatively to acquire different ligand specificities such as 9-O-Ac- 
sialosides versus non-O-acetylated-sialoglycans. This evolutionary 
plasticity is reminiscent of what has been described for the BCoV 
HE lectin domain in comparison with influenza A/B hemagglutinin 
and influenza C/D HEF’. 

Sialic acids cap numerous oligosaccharides found at the surface 
of eukaryotic cells and constitute an important class of receptors for 
several human pathogens*’’””**. Modulation of attachment to sialo- 
glycans can therefore have profound effects on zoonotic transmis- 
sion, tropism and virulence of many viruses. For instance, a single 
point mutation in the highly pathogenic H5N1 avian influenza
virus hemagglutinin was proposed to account for most of the preference
switch from avian enteric tract receptors (α2,3-linked sialic
acid) to human respiratory tract receptors (α2,6-linked sialic acid)”’.
Although influenza A/B hemagglutinin, influenza C/D HEF and 
coronavirus HE have distinct architectures compared with those of 
coronavirus S glycoproteins, common rules of ligand engagement 
emerge. These rules also appear to extend to the interactions of sialo- 
glycans with adenoviruses” and reoviruses™. In all cases, sialic acid 
binding involves burying a small surface area (300–400 Å²) through
contacts with a solvent-exposed groove of the protein. One face of 
the sialic acid ligand makes extensive interactions with the viral proteins,
whereas the opposite, solvent-exposed face makes few contacts.
The binding affinity for sialic acids usually lies in the
micromolar-to-millimolar range, and the aforementioned viruses
display numerous oligomeric spikes to enhance adsorption to target 
receptors through avidity”. 

Despite these similarities, marked differences in the 3D organi- 
zation of the binding sites explain the selectivity of different viruses 
for unmodified or modified sialic acids. The ligand-binding sites 
of BCoV HE, influenza HEF and a subset of coronavirus S gly- 
coproteins have evolved to specifically recognize 9-O-Ac-Sia via 
hydrogen bonding with the 9-O-acetyl carbonyl moiety and formation
of a hydrophobic pocket accommodating the 9-O-acetyl
methyl?****. In contrast, influenza hemagglutinin cannot accommodate
9-O-acetylated neuraminic acids, owing to steric restrictions,
but a subset of hemagglutinins can bind to N-glycolyl neuraminic 
acids’. The HCoV-OC43 S, HCoV-HKU1 S, BCoV S and PHEV S 
glycoproteins therefore share the ligand specificity of influenza C/D 
HEF but are functionally more similar to influenza A/B hemagglu- 
tinin, by carrying receptor attachment and membrane fusion func- 
tions, whereas a dedicated HE (coronaviruses) or neuraminidase 
(influenza A/B) is responsible for the receptor-destroying activity. 
In conclusion, our results illuminate how coronaviruses recognize 
9-O-Ac-sialosides to enable infection of susceptible cells and show 
that a conserved strategy is utilized to engage such ligands across 
coronaviruses and orthomyxoviruses. 
Methods

Construct design. The fragment encoding the HCoV-OC43 S ectodomain 
(residues 15-1263, UniProtKB: Q696P8) was amplified by (RT-)PCR from the 
viral genome and placed into a modified pCAGGS mammalian expression vector 
with a CD5 N-terminal signal peptide (MPMGSLQPLATLYLLGMLVASVLA) 
and an engineered C-terminal extension encoding a GCN4 trimerization
motif (IKRMKQIEDKIEEIESKQKKIENEIARIKKIK), a thrombin cleavage site
(underlined) (LVPRGSLE), and an eight-residue Strep-tag (WSHPQFEK) followed
by a stop codon, as previously described”'*'’. This construct results in fusing the
GCN4 trimerization motif in register with the HR2 helix at the C-terminal end
of the HCoV-OC43 S-encoding ectodomain sequence. A mutant gene carrying
three R-to-G amino acid mutations to abolish furin cleavage (754-RRSRG-758
→ 754-GGSGG-758) at the S1-S2 junction (S2 cleavage site) was also generated
following the same strategy. A pCAGGS vector encoding the HCoV-OC43 S 
domain A (residues 1-306) C-terminally extended with a thrombin cleavage site 
followed by the Fc region of human IgG was generated as described previously”. 
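
As an illustration of how the construct elements described above fit together, the following Python sketch concatenates the quoted tag sequences and applies the furin-site substitution. It is only a sketch: the ectodomain is represented by a short placeholder (hypothetical, since the full Q696P8 sequence is not reproduced here), and all function and variable names are illustrative rather than taken from the cloning workflow.

```python
# Sequence elements quoted in the construct description.
CD5_SIGNAL = "MPMGSLQPLATLYLLGMLVASVLA"
GCN4_MOTIF = "IKRMKQIEDKIEEIESKQKKIENEIARIKKIK"
THROMBIN_SITE = "LVPRGSLE"
STREP_TAG = "WSHPQFEK"

FURIN_WT = "RRSRG"   # residues 754-758, wild type
FURIN_MUT = "GGSGG"  # residues 754-758, cleavage-deficient mutant


def count_substitutions(wild_type: str, mutant: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(wild_type, mutant))


def abolish_furin_site(ectodomain: str) -> str:
    """Replace the first wild-type furin motif with the mutant motif."""
    if FURIN_WT not in ectodomain:
        raise ValueError("furin motif not found")
    return ectodomain.replace(FURIN_WT, FURIN_MUT, 1)


def assemble_construct(ectodomain: str) -> str:
    """Signal peptide + ectodomain + GCN4 motif + thrombin site + Strep-tag."""
    return CD5_SIGNAL + ectodomain + GCN4_MOTIF + THROMBIN_SITE + STREP_TAG


# Hypothetical placeholder, NOT the real HCoV-OC43 S ectodomain sequence.
placeholder_ectodomain = "AAAA" + FURIN_WT + "AAAA"
mutant_ectodomain = abolish_furin_site(placeholder_ectodomain)
construct = assemble_construct(mutant_ectodomain)

print(count_substitutions(FURIN_WT, FURIN_MUT), "R-to-G substitutions")
print(len(STREP_TAG), "residues in the Strep-tag")
```

With the real ectodomain sequence substituted for the placeholder, the same helper would verify that exactly three residues differ between the wild-type and mutant cleavage motifs.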


Protein expression and purification. HEK293F cells were grown in suspension 
using FreeStyle 293 Expression Medium (Life Technologies) at 37 °C in a
humidified 8% CO2 incubator rotating at 130 r.p.m. Wild-type or mutant HCoV-
OC43 S ectodomain constructs were transfected into 250 ml cultures with cells
grown to a density of 1 million cells per milliliter using 293fectin (ThermoFisher 
Scientific). After 4 d, supernatant was collected, and cells were kept in culture for 
an additional 4 d, yielding two harvests per transfection. Recombinant wild-type or 
mutant HCoV-OC43 S ectodomain was purified from clarified supernatants using 
a 1 ml StrepTrap column (GE Healthcare). Purified proteins were concentrated 
and flash frozen in Tris-saline buffer (20 mM Tris, pH 8.0, 150 mM NaCl) prior to 
negative staining and cryo-EM analysis. 


Negative stain electron microscopy. Protein samples were adsorbed to glow-
discharged carbon-coated copper grids for 30 s prior to 2% uranyl formate
staining. Micrographs were recorded using the Leginon software’' on a 120 kV
FEI Tecnai G2 Spirit with a Gatan Ultrascan 4000 CCD camera at a nominal
magnification of 67,000×. The defocus ranged from 1.0 to 2.0 µm, and the pixel size was 1.6 Å.


Conformational change analysis using negative-staining electron microscopy 
and SDS-PAGE. Wild-type HCoV-OC43 S ectodomain trimer at 1 mg ml⁻¹
(6.6 µM spike monomer) was either digested with trypsin at 14 µg ml⁻¹ for 30 min
at room temperature or left undigested, after which 1.5 mM PMSF was added to stop the reaction.
The samples were subsequently incubated overnight at 4 °C with 100 mM
9-O-Ac-sia, for 25 min at 50 °C, or for 30 min at pH 4.5 using 20 mM sodium citrate buffer
before being analyzed via negative-staining EM and SDS-PAGE. 
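
The pairing of a mass concentration (1 mg ml⁻¹) with a molar concentration (6.6 µM spike monomer) implies a monomer molecular mass, and the ligand concentration implies a large molar excess over protein. The Python sketch below works through that arithmetic; the function names are illustrative, and the derived values are back-calculated from the stated concentrations rather than reported in the text.

```python
def implied_molecular_mass_da(mass_conc_mg_per_ml: float, molar_conc_um: float) -> float:
    """Molecular mass (Da, i.e. g/mol) implied by a mass and a molar concentration."""
    grams_per_litre = mass_conc_mg_per_ml  # 1 mg/ml equals 1 g/l
    mol_per_litre = molar_conc_um * 1e-6
    return grams_per_litre / mol_per_litre


def molar_excess(ligand_mm: float, protein_um: float) -> float:
    """Fold molar excess of ligand (mM) over protein (µM)."""
    return (ligand_mm * 1e-3) / (protein_um * 1e-6)


# Values stated in the text: 1 mg/ml corresponds to 6.6 µM monomer; 100 mM ligand.
mass_da = implied_molecular_mass_da(1.0, 6.6)
excess = molar_excess(100.0, 6.6)

print(f"implied monomer mass: {mass_da / 1000:.0f} kDa")
print(f"ligand molar excess over monomer: {excess:.0f}-fold")
```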


Cryo-EM sample preparation and data collection. Three microliters of HCoV-
OC43 S at 1 mg ml⁻¹ was applied to a 2/2 C-flat grid (Protochips) that had been
glow discharged for 30 s at 20 mA. After preferential orientation was observed,
2.7 µl of HCoV-OC43 S at 10 mg ml⁻¹ was mixed with 0.3 µl of 180 mM
n-octyl-β-D-glucopyranoside (OG) immediately before being applied to a glow-
discharged grid. Thereafter, grids were plunge frozen in liquid ethane using an
FEI Mark IV Vitrobot with a 6.5–7.5 s blot time at 100% humidity and 20 °C.
Incubation of 1.1 µM HCoV-OC43 S with 100 mM 9-O-acetylated sialic acid
(9-O-Ac-sia) was performed overnight at 4 °C, and immediately before vitrification,
OG was added to the reaction mixture at a final concentration of 18 mM. Data
were acquired using an FEI Titan Krios transmission electron microscope
operated at 300 kV and equipped with a Gatan K2 Summit direct detector and
Gatan Quantum GIF energy filter, operated in zero-loss mode with a slit width
of 20 eV. Automated data collection was carried out using Leginon” at a nominal
magnification of 130,000× with a pixel size of 0.525 Å for apo-HCoV-OC43 S
(super-resolution mode) and 1.05 Å for holo-HCoV-OC43 S (counted mode). The
dose rate was adjusted to 8 counts/pixel/s, and each movie was dose-fractionated
in 50 (apo) or 60 (holo) frames of 200 ms. A total of 2,211 and 2,402 micrographs
were collected for apo- and holo-HCoV-OC43 S, respectively, with a defocus range
between 1.3 and 1.8 µm.
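
The mixing volumes and frame settings given above determine the final detergent concentration and the exposure time per movie. The following Python sketch reproduces that arithmetic. The per-Å² estimate at the end is not reported in the text; it assumes that the stated dose rate refers to physical detector pixels of 1.05 Å and is expressed in detector counts rather than electrons. All function names are illustrative.

```python
def diluted_concentration(stock_conc: float, stock_volume: float, total_volume: float) -> float:
    """C1 * V1 = C2 * V2 solved for C2; units carry through from the inputs."""
    return stock_conc * stock_volume / total_volume


def exposure_seconds(n_frames: int, frame_ms: float) -> float:
    """Total exposure of a dose-fractionated movie."""
    return n_frames * frame_ms / 1000.0


def counts_per_square_angstrom(rate_counts_per_pixel_s: float,
                               exposure_s: float,
                               physical_pixel_a: float) -> float:
    """Accumulated counts per Å², assuming the rate is per physical pixel."""
    return rate_counts_per_pixel_s * exposure_s / physical_pixel_a ** 2


# 2.7 µl protein at 10 mg/ml mixed with 0.3 µl of 180 mM OG.
total_ul = 2.7 + 0.3
og_final_mm = diluted_concentration(180.0, 0.3, total_ul)
protein_final_mg_ml = diluted_concentration(10.0, 2.7, total_ul)

# 50 (apo) or 60 (holo) frames of 200 ms each.
apo_exposure = exposure_seconds(50, 200)
holo_exposure = exposure_seconds(60, 200)

# Assumption-laden estimate, not a value reported in the text.
apo_counts = counts_per_square_angstrom(8.0, apo_exposure, 1.05)
holo_counts = counts_per_square_angstrom(8.0, holo_exposure, 1.05)

print(f"OG: {og_final_mm:.1f} mM, protein: {protein_final_mg_ml:.1f} mg/ml")
print(f"exposure: {apo_exposure:.0f} s (apo), {holo_exposure:.0f} s (holo)")
```

The computed OG concentration agrees with the 18 mM final concentration quoted for the ligand-bound sample.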


Cryo-EM data processing. Movie frame alignment, estimation of the microscope
contrast-transfer function parameters, particle picking and extraction were carried
out using Warp”. Particle images were extracted with a box size of 800 binned
to 400 for apo-HCoV-OC43 S or with a box size of 400 for holo-HCoV-OC43 S,
both yielding a pixel size of 1.05 Å. Reference-free 2D classification in Relion was
used to select particles from the original 197,791 and 332,912 for apo- and holo-
HCoV-OC43 S, respectively. The MHV S cryo-EM map’ was used to generate
an initial model of apo-HCoV-OC43 S. The initial model of holo-HCoV-OC43
S was generated using the apo-HCoV-OC43 S map. Relion 3D classification
without symmetry was used to select ~83,000 and ~178,000 particles from
apo- and holo-HCoV-OC43 S, respectively. CTF refinement in Relion3.0 (ref.
”) was used to refine per-particle defocus values. Particle images were subjected
to the Bayesian polishing procedure implemented in Relion3.0 (refs. ”»”*) before
performing another round of per-particle defocus refinement. The particles were
then subjected to 3D classification without refining angles and shifts using the
same soft mask as that used during 3D refinement and with a tau value of 30.
Final 3D refinement of the apo- and holo-HCoV-OC43 S datasets imposing C3
symmetry was carried out using non-uniform refinement in cryoSPARC” and
yielded reconstructions at 2.9- and 2.8-Å resolution, respectively. Local resolution
estimation, filtering and sharpening were carried out using cryoSPARC. Reported
resolutions are based on the gold-standard Fourier shell correlation (FSC) of 0.143
criterion”, and FSC curves were corrected for the effects of soft masking by high-
resolution noise substitution”.
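
To make the pixel-size and resolution statements above concrete, the sketch below shows how binning changes the pixel size and how a resolution is read off an FSC curve at the 0.143 threshold. It is a simplified stand-in for what Relion and cryoSPARC do internally, not their implementation; the FSC values in the example are hypothetical and the function names are illustrative.

```python
def binned_pixel_size(pixel_size_a: float, box_px: int, binned_box_px: int) -> float:
    """Pixel size (Å) after rescaling a box of box_px to binned_box_px pixels."""
    return pixel_size_a * box_px / binned_box_px


def nyquist_resolution(pixel_size_a: float) -> float:
    """Finest resolution (Å) representable at a given pixel size."""
    return 2.0 * pixel_size_a


def fsc_resolution(freqs, fsc, threshold=0.143):
    """Resolution (Å) at which the FSC first drops below the threshold.

    freqs: spatial frequencies in 1/Å, ascending; fsc: matching FSC values.
    Linear interpolation is used between the two shells bracketing the
    threshold. Returns None if the curve never drops below the threshold.
    """
    for i in range(1, len(freqs)):
        if fsc[i - 1] >= threshold > fsc[i]:
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f_cross = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None


# Apo data: 0.525 Å super-resolution pixels, box of 800 binned to 400.
apo_pixel = binned_pixel_size(0.525, 800, 400)
nyquist = nyquist_resolution(apo_pixel)

# Hypothetical FSC curve, for illustration only.
example_freqs = [0.1, 0.2, 0.3, 0.4]
example_fsc = [1.0, 0.9, 0.5, 0.05]
example_resolution = fsc_resolution(example_freqs, example_fsc)

print(f"binned pixel size: {apo_pixel:.2f} Å, Nyquist limit: {nyquist:.1f} Å")
```

At a pixel size of 1.05 Å the Nyquist limit is 2.1 Å, so the reported 2.9- and 2.8-Å reconstructions lie within what the sampling can support.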


Cryo-EM model building and analysis. UCSF Chimera” and Coot” were used
to fit the MHV atomic model (PDB 3JCL) into the holo-HCoV-OC43 S cryo-EM
map. The models were subsequently manually rebuilt using Coot”. N-linked 
glycans were hand built into the density where visible, and the models were rebuilt 
and refined using Rosetta*’*’. Models were analyzed using MolProbity™, Privateer® 
and PISA**. Figures were generated using UCSF Chimera”® and ChimeraX*’. 
Analysis of the ligand-binding site electrostatic surface potential was performed 
using PDB2PQR* and APBS”’.


Biolayer interferometry. HCoV-OC43 S1A-Fc was expressed in HEK293T cells
and purified from the cell culture supernatant by protein A chromatography,
as described*’. Monomeric domain A, wild type or with a W90A substitution,
was subsequently obtained by on-the-bead thrombin cleavage’’, after which the
proteins were concentrated to up to 3.8 mg ml⁻¹ in PBS, aliquoted and stored at
−80 °C until further use. Biolayer interferometry analysis was performed on an
Octet RED384 machine. All assays were performed using Fortebio Kinetics Buffer
(KB; PBS supplemented with 0.1% BSA, 0.02% Tween20 and 0.05% sodium azide)
at 30 °C. Synthetic biotinylated 6-sialyl-5-N-,9-O-acetyl-lactosamine (9OAc6SLN)
or 6-sialyl-5-N-acetyl-lactosamine (6SLN) dissolved to 100 nM were loaded onto
streptavidin (SA) biosensors to maximum loading levels (until no further increase
in reflection was observed). Sensors were washed in KB until a stable baseline
was obtained. Binding of monomeric HCoV-OC43 S domain A was performed
by moving receptor-loaded sensors to wells containing 100 µl of purified protein,
dissolved in KB to various concentrations, for up to 3 min, followed by
dissociation for 3 min. To abolish unspecific binding, sensors were subjected to five
successive association/dissociation cycles. To test whether binding of domain A
was sialate-9-O-acetyl-dependent, biosensors loaded with 9OAc6SLN were de-
O-acetylated by dipping them in wells containing 20 µg ml⁻¹ porcine torovirus
P4 HE-Fc* in KB for 30 min, then washing prior to association/dissociation (pre-
HE) or after a cycle of association/dissociation, upon which the biosensors were
subjected to a final cycle (post-HE). The equilibrium dissociation constant, K_D, was
determined from three independent experiments with the ‘Response’ option of the
Octet Data Analysis software. The half-life of the domain A-9OAc6SLN complex
was calculated manually from the dissociation curves.
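
One common way to obtain a half-life from a dissociation curve is to fit a single-exponential decay and convert the off-rate with t½ = ln 2 / k_off. The Python sketch below illustrates this on synthetic data. It is not the procedure used here (the half-life was calculated manually and K_D was obtained with the Octet software); it assumes 1:1 binding and dissociation to a zero baseline, and all names and numbers are hypothetical.

```python
import math


def fit_off_rate(times_s, responses):
    """Least-squares slope of ln(response) versus time; returns k_off in 1/s.

    Assumes single-exponential dissociation to a zero baseline and
    strictly positive responses.
    """
    logs = [math.log(r) for r in responses]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, logs))
    return -sxy / sxx


def half_life(k_off: float) -> float:
    """Half-life of a complex that decays with first-order rate k_off."""
    return math.log(2) / k_off


def steady_state_response(conc: float, k_d: float, r_max: float) -> float:
    """Equilibrium response of a 1:1 binding isotherm."""
    return r_max * conc / (k_d + conc)


# Synthetic dissociation curve with a known off-rate of 0.5 per second.
true_k_off = 0.5
times = [0.0, 1.0, 2.0, 3.0, 4.0]
signal = [math.exp(-true_k_off * t) for t in times]

k_off = fit_off_rate(times, signal)
t_half = half_life(k_off)
print(f"k_off = {k_off:.3f} 1/s, half-life = {t_half:.3f} s")
```

In the 1:1 model, the equilibrium response reaches half of its maximum when the analyte concentration equals K_D, which is the relationship a steady-state fit exploits.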


Pseudovirus entry assays. HCoV-OC43 S-pseudotyped VSV-ΔG particles were
prepared as previously described”. Briefly, HEK293T cells at 70% confluency
were transfected with PEI-complexed plasmid DNA. For coexpression of HCoV-
OC43 S and BCoV HE-Fc, S expression vectors and pCD5-BCoV HE-Fc were
mixed at molar ratios of 8:1. At 48 h after transfection, cells were transduced with
VSV-G-pseudotyped VSVΔG/Fluc” at a multiplicity of infection of 1. Cell-free
supernatants were harvested at 24 h after transduction and filtered through
0.45-µm membranes, and virus particles were purified and concentrated via sucrose
cushion ultracentrifugation at approximately 100,000g for 3 h. Pelleted virions
were resuspended in PBS and stored at −80 °C until further use. Inoculation of
HRT18 monolayers in 96-well format was performed with equal amounts of
S-pseudotyped virions, as calculated from VSV-N content (roughly corresponding
to the yield from 2 x 10° transfected and transduced cells), diluted in 10% FBS-
supplemented DMEM. At 18 h post infection, cells were lysed using passive
lysis buffer (Promega). Firefly luciferase expression was measured using a firefly
luciferase assay system. Infection experiments were performed independently in
triplicate, each time with three technical replicates. Pseudovirus incorporation of
Flag-tagged OC43 S was determined for the parental type and each of the mutants
via Western blotting and by calculating the S content (measured with monoclonal
antibody ANTI-FLAG M2; Sigma) relative to that of VSV-N (measured with anti-
VSV-N monoclonal antibody 10G4; Kerafast).
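
The normalization described above (S content relative to VSV-N, and luciferase readout compared across technical replicates) can be sketched as follows. This is an illustrative calculation under the assumption that values are expressed relative to the parental S; the function names and all numbers are hypothetical and do not come from the experiments.

```python
from statistics import mean


def relative_incorporation(s_signal: float, vsv_n_signal: float,
                           parental_s: float, parental_n: float) -> float:
    """S/VSV-N band-intensity ratio of a mutant relative to the parental ratio."""
    return (s_signal / vsv_n_signal) / (parental_s / parental_n)


def relative_infectivity(mutant_rlu, parental_rlu) -> float:
    """Mean luciferase signal of technical replicates relative to parental."""
    return mean(mutant_rlu) / mean(parental_rlu)


# Hypothetical band intensities and relative light units (RLU).
incorporation = relative_incorporation(50.0, 100.0, 100.0, 100.0)
infectivity = relative_infectivity([10.0, 20.0, 30.0], [100.0, 200.0, 300.0])

print(f"S incorporation relative to parental: {incorporation:.2f}")
print(f"infectivity relative to parental: {infectivity:.2f}")
```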


Reporting Summary. Further information on research design is available in the 
Nature Research Reporting Summary linked to this article. 


Data availability 

The cryo-EM maps and atomic models have been deposited in the Electron 
Microscopy Data Bank and the Protein Data Bank with accession codes EMD-0557 
and PDB ID 6NZK (holo-HCoV-OC43 S) and EMD-20070 and PDB ID 6OHW
(apo-HCoV-OC43 S). 